setwd("~/Documents/2DataViz /SpokaneProviders ")
load("providerspokane.rda")
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyr)
In this analysis we wanted to learn more about Medicare services in Spokane. We set out to learn how the gender of the provider, the type of provider, and the place of service affect the medical services you receive.
First, we identify our variables:

- Gender.of.the.Provider
- Provider.Type
- Place.of.Service

We will compare these to the following variables:

- Number.of.Services – num
- Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services – int
- Average.Medicare.Allowed.Amount – num
- Average.Submitted.Charge.Amount – num
- Average.Medicare.Payment.Amount – num
- Average.Medicare.Standardized.Amount – num
Next, we create subsets of the 3 main variables we are looking at in order to work with the data more easily.

### Gender - subset dataset
Genderstudy=providerspokane %>% select(Gender.of.the.Provider, Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, Average.Medicare.Allowed.Amount, Average.Submitted.Charge.Amount, Number.of.Services, Average.Medicare.Standardized.Amount, Average.Medicare.Payment.Amount)
Providerstudy=providerspokane %>% select(Provider.Type, Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, Average.Medicare.Allowed.Amount, Average.Submitted.Charge.Amount, Number.of.Services, Average.Medicare.Standardized.Amount, Average.Medicare.Payment.Amount)
Placestudy=providerspokane %>% select(Place.of.Service, Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, Average.Medicare.Allowed.Amount, Average.Submitted.Charge.Amount, Number.of.Services, Average.Medicare.Standardized.Amount, Average.Medicare.Payment.Amount)
Genderstudy[Genderstudy==""] <-NA
Genderstudy <- na.omit(Genderstudy)
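The two cleaning steps above can be checked on a small example. The sketch below uses a hypothetical `toy` data frame (invented values, not the Spokane data) to show that blank gender codes become `NA` and that `na.omit()` then drops those rows.

```r
# Hypothetical toy data standing in for Genderstudy; values are invented.
toy <- data.frame(
  Gender.of.the.Provider = c("M", "", "F", ""),
  Number.of.Services = c(107, 20, 1587, 45),
  stringsAsFactors = FALSE
)

# Replace empty strings with NA, then drop rows that contain any NA.
toy[toy == ""] <- NA
toy_clean <- na.omit(toy)

nrow(toy)        # rows before dropping
nrow(toy_clean)  # rows after dropping
```

Comparing the two row counts on the real data would show how many records are excluded from the gender comparisons.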
First, we begin with gender comparisons. This will allow us to see how the gender of the providers relates to the various variables in the Genderstudy dataset.
head(Genderstudy)
## Gender.of.the.Provider
## 1 M
## 2 F
## 3 M
## 4 M
## 5 M
## 6 M
## Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services
## 1 107
## 2 725
## 3 1410
## 4 68
## 5 79
## 6 91
## Average.Medicare.Allowed.Amount Average.Submitted.Charge.Amount
## 1 50.31112 75
## 2 39.02793 75
## 3 10.58000 34
## 4 139.01324 315
## 5 60.71671 145
## 6 26.36143 100
## Number.of.Services Average.Medicare.Standardized.Amount
## 1 107 35.71421
## 2 1587 26.43543
## 3 1410 10.37000
## 4 68 110.36559
## 5 79 52.73152
## 6 91 18.23934
## Average.Medicare.Payment.Amount
## 1 34.49804
## 2 29.09792
## 3 10.34608
## 4 104.08838
## 5 42.38532
## 6 18.15374
library(ggplot2)
ggplot(Genderstudy, aes(x = Number.of.Services, y = Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Number of Services & Number of Distinct Medicare Beneficiary Per Day Services for Gender", x = "# of Services", y = "# of Distinct Beneficiary Services(per day)")
Results: Female and male providers show a similar pattern in performing beneficiary services. For both genders, the majority of services performed tend to be beneficiary services.
ggplot(Genderstudy, aes(x = Number.of.Services, y = Average.Medicare.Allowed.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Number of Services & Average Medicare Allowed Amount for Gender", x = "# of Services", y = "Medicare Allowed(mean)")
Results: More male than female providers perform services for which Medicare allows payment.
ggplot(Genderstudy, aes(x = Number.of.Services, y = Average.Submitted.Charge.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Number of Services & Average Submitted Charge Amount for Gender", x = "# of Services", y = "Submitted Charge(mean)")
ggplot(Genderstudy, aes(x = Number.of.Services, y = Average.Medicare.Payment.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Number of Services & Average Medicare Payment Amount for Gender", x = "# of Services", y = "Medicare Payment(mean)")
Results: The submitted charge is much higher than what Medicare actually pays. Submitted charges tended to fall between 0 and 5,000, but Medicare payments were 2,000 or less. Records with a high number of services show lower submitted charges and payments of next to nothing.
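One way to put a number on the gap between charges and payments is the share of the submitted charge that Medicare paid. The sketch below is only an illustration: it uses the six rows printed by `head(Genderstudy)` earlier, and the column name `paid.share` is introduced here and is not part of the original dataset.

```r
# First six rows as printed by head(Genderstudy) earlier in this report.
six <- data.frame(
  Average.Submitted.Charge.Amount = c(75, 75, 34, 315, 145, 100),
  Average.Medicare.Payment.Amount = c(34.49804, 29.09792, 10.34608,
                                      104.08838, 42.38532, 18.15374)
)

# Share of the submitted charge that Medicare paid, per row.
# paid.share is a hypothetical helper column, not a field in the source data.
six$paid.share <- six$Average.Medicare.Payment.Amount /
  six$Average.Submitted.Charge.Amount

round(six$paid.share, 2)
```

In these six rows every payment is less than half of the submitted charge. Applying the same ratio to the full `Genderstudy` data frame would show whether that holds in general.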
ggplot(Genderstudy, aes(x = Number.of.Services, y = Average.Medicare.Standardized.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Number of Services & Average Medicare Standardized Amount for Gender", x = "# of Services", y = "Standardized Medicare(mean)")
Results: Similar to the Medicare payment results.
ggplot(Genderstudy, aes(x = Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, y = Average.Medicare.Allowed.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Number of Distinct Medicare Beneficiary Per Day Services & Average Medicare Allowed Amount for Gender", x = "# of Distinct Beneficiary Services(per day)", y = "Medicare Allowed(mean)")
Results: Procedures covered by Medicare tend to have a greater number of participants than services not covered by Medicare. There is a higher concentration of male providers serving people who have Medicare.
ggplot(Genderstudy, aes(x = Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, y = Average.Submitted.Charge.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Number of Distinct Medicare Beneficiary Per Day Services & Average Submitted Charge Amount for Gender", x = "# of Distinct Beneficiary Services(per day)", y = "Submitted Charge(mean)")
ggplot(Genderstudy, aes(x = Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, y = Average.Medicare.Payment.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Number of Distinct Medicare Beneficiary Per Day Services & Average Medicare Payment Amount for Gender", x = "# of Distinct Beneficiary Services(per day)", y = "Medicare Payment(mean)")
Results: More beneficiary services are provided by males. Once again, the submitted charges are much greater than what Medicare actually pays.
ggplot(Genderstudy, aes(x = Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, y = Average.Medicare.Standardized.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Number of Distinct Medicare Beneficiary Per Day Services & Average Medicare Standardized Amount for Gender", x = "# of Distinct Beneficiary Services(per day)", y = "Standardized Medicare(mean)")
Results: The average standardized Medicare amount is higher for male providers. Male providers also tend to perform more procedures per day.
ggplot(Genderstudy, aes(x = Average.Medicare.Allowed.Amount, y = Average.Submitted.Charge.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Average Medicare Allowed Amount & Average Submitted Charge Amount for Gender", x = "Medicare Allowed(mean)", y = "Submitted Charge(mean)")
Results: Records with a higher Medicare allowed amount tend to have a higher submitted charge. There is a positive correlation between these variables.
ggplot(Genderstudy, aes(x = Average.Medicare.Allowed.Amount, y = Average.Medicare.Payment.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Average Medicare Allowed Amount & Average Medicare Payment Amount for Gender", x = "Medicare Allowed(mean)", y = "Medicare Payment(mean)")
Results: The Medicare allowed amount correlates with the Medicare payment for each record. The majority of these procedures are performed by males.
ggplot(Genderstudy, aes(x = Average.Medicare.Allowed.Amount, y = Average.Medicare.Standardized.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Average Medicare Allowed Amount & Average Medicare Standardized Amount for Gender", x = "Medicare Allowed(mean)", y = "Standardized Medicare(mean)")
ggplot(Genderstudy, aes(x = Average.Submitted.Charge.Amount, y = Average.Medicare.Payment.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Average Submitted Charge Amount & Average Medicare Payment Amount for Gender", x = "Submitted Charge(mean)", y = "Medicare Payment(mean)")
ggplot(Genderstudy, aes(x = Average.Submitted.Charge.Amount, y = Average.Medicare.Standardized.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Average Submitted Charge Amount & Average Medicare Standardized Amount for Gender", x = "Submitted Charge(mean)", y = "Standardized Medicare(mean)")
Results: Displayed more clearly, the amount Medicare pays is significantly less than the submitted charge. There is a somewhat positive relationship between the variables, indicating that the higher the submitted charge, the higher the amount Medicare will cover.
ggplot(Genderstudy, aes(x = Average.Medicare.Payment.Amount, y = Average.Medicare.Standardized.Amount)) +
  geom_point(shape = 1) + facet_grid(Gender.of.the.Provider ~ .) +
  labs(title = "Average Medicare Payment Amount & Average Medicare Standardized Amount for Gender", x = "Medicare Payment(mean)", y = "Standardized Medicare(mean)")
Results: Another positive correlation between variables. The standardized Medicare amount trends alongside the Medicare payment variable.
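The correlations read off the scatterplots can also be computed directly. The following is a minimal sketch on a hypothetical `toy` data frame with invented values. On the real data, `toy` would be replaced by `Genderstudy`.

```r
library(dplyr)

# Hypothetical toy data; the values are invented for illustration only.
toy <- data.frame(
  Gender.of.the.Provider = c("F", "F", "F", "M", "M", "M"),
  Average.Medicare.Payment.Amount = c(10, 20, 30, 15, 25, 40),
  Average.Medicare.Standardized.Amount = c(11, 19, 32, 14, 27, 41)
)

# Pearson correlation between the two amounts, computed within each gender.
cor_by_gender <- toy %>%
  group_by(Gender.of.the.Provider) %>%
  summarize(r = cor(Average.Medicare.Payment.Amount,
                    Average.Medicare.Standardized.Amount))

cor_by_gender
```

A value of `r` close to 1 in each group would support the positive relationship seen in the faceted plots.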
Using: Provider.Type

Variables:

- Number of Services
- Distinct beneficiary per day services
- Average Medicare allowed
- Charged
- Paid amount
- Medicare standardized amount
Providerstudy=providerspokane %>% select( Provider.Type, Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, Average.Medicare.Allowed.Amount, Average.Submitted.Charge.Amount, Number.of.Services, Average.Medicare.Standardized.Amount, Average.Medicare.Payment.Amount)
ggplot(Providerstudy, aes(x = Provider.Type, y = Average.Medicare.Allowed.Amount)) +
  geom_bar(stat = "identity", alpha = .60) + coord_flip() +
  labs(title = "Type of Provider and the Average Amount of Medicare Allowed", x = "Provider Type", y = "Average Medicare Allowed Amount")
Diagnostic Radiology has the highest Medicare allowed amount. Diagnostic radiology includes MRI, CT, x-ray, and ultrasound scans that are commonly used every day, so the high Medicare allowed amount fits. Ambulatory Surgery Center ranks 2nd for the highest Medicare allowed amount and accounts for same-day surgical care and many diagnostic and preventive procedures. Many of these come in through the ER, which is extremely busy on a day-to-day basis. Some of the provider types with low Medicare allowed amounts include Plastic and Reconstructive Surgery, Hospice and Palliative Care, Geriatric Medicine, and Mammogram Screening Center. A high percentage of plastic surgery is elective, so it is logical that it is low. But it is surprising to see Hospice and Palliative Care and Geriatric Medicine with such comparatively low Medicare allowed amounts, as Medicare is colloquially known as a federal program for helping with older people’s medical fees. Mammograms cost approximately $100, so overall, less Medicare money needs to go towards them.
ggplot(Providerstudy, aes(Number.of.Services)) + geom_density( alpha = 0.8)+facet_wrap(~ Provider.Type,scales="free")
meancharge=Providerstudy%>%group_by(Provider.Type)%>%summarize(mean.charge.amount =mean(Average.Submitted.Charge.Amount))
ggplot(data = meancharge, aes(x = mean.charge.amount, y = reorder(Provider.Type, mean.charge.amount))) +
  geom_point() +
  labs(title = "Type of Provider and the Mean of Average Submitted Charge Amounts", x = "Mean Submitted Charge Amounts", y = "Provider Type")
Looking at the graph, it is not surprising that Ambulance Service Supplier has the highest mean charge amount, given the high cost of ambulances, yet a mean charge above $2,500 is still incredibly high. The next highest is the Ambulatory Surgery Center, which deals with day-to-day surgeries. It is notable that the mean cost of an ambulance is higher than the mean cost of most surgeries. Cardiac Surgery has the third highest submitted charge amount, which can be justified by the high skill necessary and the high risk of heart surgeries. The IQR falls mostly between 25 and 750 dollars.
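The interquartile range mentioned above can be computed rather than read off the plot. The sketch below assumes a hypothetical `toy` data frame with invented provider types and charges. On the real data, the same `quantile()` and `IQR()` calls would be applied to `meancharge$mean.charge.amount`.

```r
library(dplyr)

# Hypothetical toy data; provider types and charges are invented.
toy <- data.frame(
  Provider.Type = rep(c("A", "B", "C", "D"), each = 2),
  Average.Submitted.Charge.Amount = c(20, 40, 100, 200, 500, 700, 2000, 3000)
)

# Mean submitted charge per provider type, as in the report.
meancharge_toy <- toy %>%
  group_by(Provider.Type) %>%
  summarize(mean.charge.amount = mean(Average.Submitted.Charge.Amount))

# First and third quartiles of the per-type means, and their difference.
q <- quantile(meancharge_toy$mean.charge.amount, probs = c(0.25, 0.75))
q
IQR(meancharge_toy$mean.charge.amount)
```

Reporting the two quartiles alongside the plot makes the stated range checkable.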
meandistinctperdayservices=Providerstudy%>%group_by(Provider.Type)%>%summarize(mean.perday.services =mean(Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services))
ggplot(meandistinctperdayservices, aes(x = mean.perday.services, y = reorder(Provider.Type, mean.perday.services))) +
  geom_point() +
  labs(title = "Type of Provider and the Number of Distinct Medicare Beneficiary Per Day Services", x = "Mean of Number of Distinct Medicare Beneficiary Per Day Services", y = "Provider Type")
The mean number of distinct Medicare beneficiary per day services is concentrated in the 0-250 range, with outliers of Clinical Laboratory above 1750 and Ambulance Service Supplier upwards of 2250, almost at 2500.
To look more closely at the groupings, we excluded the outliers in the following graph.
ggplot(meandistinctperdayservices, aes(x = mean.perday.services, y = reorder(Provider.Type, mean.perday.services))) +
  geom_point() + xlim(0, 500) +
  labs(title = "Type of Provider and the Number of Distinct Medicare Beneficiary Per Day Services", x = "Mean of Number of Distinct Medicare Beneficiary Per Day Services", y = "Provider Type")
## Warning: Removed 2 rows containing missing values (geom_point).
The above graph shows that most types of providers fall in the range of 25-125 beneficiary services per day.
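`xlim()` turns out-of-range values into missing values at draw time, which is what produces the warning above. An alternative, sketched here on a hypothetical `toy_means` data frame with invented values, is to filter explicitly before plotting so the excluded provider types are visible in the code.

```r
library(dplyr)

# Hypothetical per-type means; names and values are invented.
toy_means <- data.frame(
  Provider.Type = c("A", "B", "C", "D"),
  mean.perday.services = c(40, 110, 1800, 2400)
)

# Keep only provider types inside the 0-500 window used for the zoomed plot.
zoomed <- toy_means %>% filter(mean.perday.services <= 500)

# The rows that were left out, for reporting alongside the plot.
excluded <- toy_means %>% filter(mean.perday.services > 500)

zoomed
excluded
```

Passing the filtered data frame to `ggplot()` gives the same zoomed view without the warning, and `excluded` documents which types were dropped.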
ggplot(meandistinctperdayservices, aes(x = mean.perday.services, y = reorder(Provider.Type, mean.perday.services))) +
  geom_point() + xlim(0, 500) + geom_density(fill = "blue") +
  labs(title = "Type of Provider and the Number of Distinct Medicare Beneficiary Per Day Services", x = "Mean of Number of Distinct Medicare Beneficiary Per Day Services", y = "Provider Type")
## Warning: Removed 2 rows containing non-finite values (stat_density).
## Warning: Removed 2 rows containing missing values (geom_point).
The lines added to the above graph show the highly concentrated areas: most provider types perform approximately 50-110 beneficiary services per day.
The next graph shows how many services are provided by each Provider type:
numberofservices=Providerstudy%>%group_by(Provider.Type)%>%summarize(mean.services =mean(Number.of.Services))
ggplot(data = numberofservices, aes(x = mean.services, y = reorder(Provider.Type, mean.services))) +
  geom_point() +
  labs(title = "Type of Provider and the Number of Services", x = "Mean of Number of Services", y = "Provider Type")
In the next graph we excluded Ambulance Service Supplier because it was an outlier.
ggplot(data = numberofservices, aes(x = mean.services, y = reorder(Provider.Type, mean.services))) +
  geom_point() +
  labs(title = "Type of Provider and the Number of Services", x = "Mean of Number of Services", y = "Provider Type") +
  xlim(0, 2250)
## Warning: Removed 1 rows containing missing values (geom_point).
Hematology and Oncology, the specialization in blood diseases and cancer, has a high number of services, along with Clinical Laboratory, which covers the tests run to learn more about the patient. The IQR falls mostly in the range of 5-45 services, which makes the more than 2,000 tests run by Clinical Laboratory shocking.
Placestudy=providerspokane %>% select(Place.of.Service, Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, Average.Medicare.Allowed.Amount, Average.Submitted.Charge.Amount, Number.of.Services, Average.Medicare.Standardized.Amount, Average.Medicare.Payment.Amount)
head(Placestudy)
## Place.of.Service
## 1 F
## 2 F
## 3 O
## 4 O
## 5 O
## 6 O
## Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services
## 1 107
## 2 725
## 3 1410
## 4 68
## 5 79
## 6 91
## Average.Medicare.Allowed.Amount Average.Submitted.Charge.Amount
## 1 50.31112 75
## 2 39.02793 75
## 3 10.58000 34
## 4 139.01324 315
## 5 60.71671 145
## 6 26.36143 100
## Number.of.Services Average.Medicare.Standardized.Amount
## 1 107 35.71421
## 2 1587 26.43543
## 3 1410 10.37000
## 4 68 110.36559
## 5 79 52.73152
## 6 91 18.23934
## Average.Medicare.Payment.Amount
## 1 34.49804
## 2 29.09792
## 3 10.34608
## 4 104.08838
## 5 42.38532
## 6 18.15374
We want to compare the number of services and the number of beneficiary services against the average submitted charge and the average amount Medicare actually paid. To do this, we will create four faceted graphs to display potential differences.
The place of service variable identifies whether the service was performed at a facility or a non-facility. A facility is usually a hospital or ambulance, but can include facilities such as a hospice, a skilled nursing facility, or a community mental health center. A non-facility is usually an office, but can be a pharmacy, homeless shelter, school, or an independent clinic. For a more in-depth list of both, please check Appendix C of the provided PDF.
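The facet labels in the plots are the raw codes. As a small sketch, the codes can be turned into readable labels before plotting. The mapping used here ("F" for facility, "O" for non-facility) is an assumption based on the description above and should be checked against Appendix C. The name `place_label` is hypothetical.

```r
# Place-of-service codes as printed by head(Placestudy) earlier in this report.
codes <- c("F", "F", "O", "O", "O", "O")

# Assumed mapping: "F" = facility, "O" = non-facility (verify in Appendix C).
place_label <- factor(codes, levels = c("F", "O"),
                      labels = c("Facility", "Non-facility"))

# Number of rows per labelled place of service.
table(place_label)
```

Using a labelled factor like this in `facet_grid()` would make the panels self-explanatory.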
ggplot(Placestudy, aes(x = Number.of.Services, y = Average.Submitted.Charge.Amount)) +
  geom_point(shape = 1) + facet_grid(Place.of.Service ~ .) +
  labs(title = "Number of Services & Average Submitted Charge Amount for Place of Service", x = "# of Services", y = "Submitted Charge(mean)")
ggplot(Placestudy, aes(x = Number.of.Services, y = Average.Medicare.Payment.Amount)) +
  geom_point(shape = 1) + facet_grid(Place.of.Service ~ .) +
  labs(title = "Number of Services & Average Medicare Payment Amount for Place of Service", x = "# of Services", y = "Medicare Payment(mean)")
ggplot(Placestudy, aes(x = Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, y = Average.Submitted.Charge.Amount)) +
  geom_point(shape = 1) + facet_grid(Place.of.Service ~ .) +
  labs(title = "Number of Distinct Beneficiary Services & Average Submitted Charge Amount for Place of Service", x = "# of Beneficiary Services", y = "Submitted Charge(mean)")
ggplot(Placestudy, aes(x = Number.of.Distinct.Medicare.Beneficiary.Per.Day.Services, y = Average.Medicare.Payment.Amount)) +
  geom_point(shape = 1) + facet_grid(Place.of.Service ~ .) +
  labs(title = "Number of Distinct Beneficiary Services & Average Medicare Payment Amount for Place of Service", x = "# of Beneficiary Services", y = "Medicare Payment(mean)")
Results: There are more services provided in non-facilities. There appear to be more services covered by Medicare in facilities.
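The comparison between facilities and non-facilities can be backed by a grouped summary. The sketch below uses only the six rows printed by `head(Placestudy)` earlier, so its numbers say nothing about the full dataset. The summary column names (`records`, `total.services`, `mean.payment`) are introduced here for illustration.

```r
library(dplyr)

# First six rows as printed by head(Placestudy) earlier in this report.
six <- data.frame(
  Place.of.Service = c("F", "F", "O", "O", "O", "O"),
  Number.of.Services = c(107, 1587, 1410, 68, 79, 91),
  Average.Medicare.Payment.Amount = c(34.49804, 29.09792, 10.34608,
                                      104.08838, 42.38532, 18.15374)
)

# Row count, total services and mean payment per place of service.
by_place <- six %>%
  group_by(Place.of.Service) %>%
  summarize(records = n(),
            total.services = sum(Number.of.Services),
            mean.payment = mean(Average.Medicare.Payment.Amount))

by_place
```

Running the same summary on the full `Placestudy` data frame would give the counts behind the results stated above.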